Path: blob/master/Part 3 - Classification/K Nearest Neighbors/[Python] K-Nearest Neighbour.ipynb
Kernel: Python 3
K-Nearest Neighbour
Data preprocessing
In [60]:
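The code bodies of the cells in this export are not preserved; presumably this first cell imports the usual libraries, along these lines:
# Reconstructed sketch -- the original cell body is not preserved in this export.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd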
In [13]:
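Presumably the data-loading cell; the file name below is an assumption based on the dataset usually used in this part of the course (Age, EstimatedSalary and a Purchased label).
# Sketch: load the dataset (file name assumed).
dataset = pd.read_csv('Social_Network_Ads.csv')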
In [15]:
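dataset.head()  # presumably a quick preview of the data; the table output below was not preserved in this export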
Out[15]:
In [38]:
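A sketch of the feature selection and train/test split suggested by the raw Age/Salary arrays displayed below; the column indices, split ratio and random seed are assumptions.
from sklearn.model_selection import train_test_split

X = dataset.iloc[:, [2, 3]].values   # Age, EstimatedSalary
y = dataset.iloc[:, 4].values        # Purchased (0/1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)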
In [39]:
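X_train[0:10]  # reconstructed display call: first ten training samples (Age, EstimatedSalary), before scaling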
Out[39]:
array([[ 27, 57000],
[ 46, 28000],
[ 39, 134000],
[ 44, 39000],
[ 57, 26000],
[ 32, 120000],
[ 41, 52000],
[ 48, 74000],
[ 26, 86000],
[ 22, 81000]])
In [40]:
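X_test[0:10]  # reconstructed display call: first ten test samples, before scaling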
Out[40]:
array([[ 46, 22000],
[ 59, 88000],
[ 28, 44000],
[ 48, 96000],
[ 29, 28000],
[ 30, 62000],
[ 47, 107000],
[ 29, 83000],
[ 40, 75000],
[ 42, 65000]])
In [41]:
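y_train[0:10]  # reconstructed display call: first ten training labels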
Out[41]:
array([0, 1, 1, 0, 1, 1, 0, 1, 0, 0])
In [42]:
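y_test[0:10]  # reconstructed display call: first ten test labels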
Out[42]:
array([0, 1, 0, 1, 0, 0, 1, 0, 0, 0])
In [44]:
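The standardised values shown below imply a feature-scaling step, presumably along these lines:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit the scaler on the training set only
X_test = sc.transform(X_test)        # apply the same scaling to the test set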
In [45]:
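X_train[0:10]  # reconstructed display call: the same ten training samples after scaling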
Out[45]:
array([[-1.06675246, -0.38634438],
[ 0.79753468, -1.22993871],
[ 0.11069205, 1.853544 ],
[ 0.60129393, -0.90995465],
[ 1.87685881, -1.28811763],
[-0.57615058, 1.44629156],
[ 0.3069328 , -0.53179168],
[ 0.99377543, 0.10817643],
[-1.16487283, 0.45724994],
[-1.55735433, 0.31180264]])
In [46]:
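X_test[0:10]  # reconstructed display call: the same ten test samples after scaling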
Out[46]:
array([[ 0.79753468, -1.40447546],
[ 2.07309956, 0.51542886],
[-0.96863208, -0.76450736],
[ 0.99377543, 0.74814454],
[-0.87051171, -1.22993871],
[-0.77239133, -0.24089709],
[ 0.89565505, 1.06812859],
[-0.87051171, 0.36998156],
[ 0.20881242, 0.13726589],
[ 0.40505317, -0.15362871]])
Fitting the classifier to the Training set
In [47]:
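The estimator repr below records the parameters actually used (5 neighbours, Minkowski metric with p=2, i.e. Euclidean distance); the cell presumably built and fitted the classifier like this:
from sklearn.neighbors import KNeighborsClassifier

classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train, y_train)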
Out[47]:
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=5, p=2,
weights='uniform')
Predicting the Test set results
In [48]:
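y_pred = classifier.predict(X_test)  # reconstructed: predict labels for the scaled test set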
In [49]:
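y_pred[0:10]  # reconstructed display call: first ten predicted labels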
Out[49]:
array([1, 1, 0, 1, 0, 0, 1, 0, 0, 0])
In [50]:
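y_test[0:10]  # reconstructed display call: first ten true labels, for comparison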
Out[50]:
array([0, 1, 0, 1, 0, 0, 1, 0, 0, 0])
The predictions are almost all correct: of the ten samples shown, only the first differs from the true label.
In [51]:
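Presumably the confusion-matrix computation that produced the array below:
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
cm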
Out[51]:
array([[48, 4],
[ 3, 25]])
That's awesome. Only 3 + 4 = 7 incorrect predictions and 48 + 25 = 73 correct predictions on the test set.
Visualising the Training set results
In [61]:
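The plot image is not preserved in this export. Below is a sketch of the usual decision-boundary visualisation: a fine meshgrid over the two scaled features is coloured by the class the classifier predicts, with the training points overlaid. The colours, titles and axis labels are assumptions.
from matplotlib.colors import ListedColormap

X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01),
                     np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01))
# Colour each grid point by the class K-NN predicts there, revealing the decision boundary.
plt.contourf(X1, X2, classifier.predict(np.c_[X1.ravel(), X2.ravel()]).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
# Overlay the training points, coloured by their true class.
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1], color=('red', 'green')[i], label=j)
plt.title('K-NN (Training set)')
plt.xlabel('Age (scaled)')
plt.ylabel('Estimated Salary (scaled)')
plt.legend()
plt.show()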
Out[61]:
Visualising the Test set results
In [62]:
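Presumably the same plotting code applied to the test split:
# Same decision-boundary sketch as above, applied to the test set.
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01),
                     np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01))
plt.contourf(X1, X2, classifier.predict(np.c_[X1.ravel(), X2.ravel()]).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1], color=('red', 'green')[i], label=j)
plt.title('K-NN (Test set)')
plt.xlabel('Age (scaled)')
plt.ylabel('Estimated Salary (scaled)')
plt.legend()
plt.show()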
Out[62]:
Gist: K-NN is a non-linear classifier; that is why it captures the irregular decision boundary in this classification problem and predicts so well.